Regularization and the small-ball method II: complexity dependent error rates
Authors
Abstract
For a convex class of functions $F$, a regularization function $\Psi(\cdot)$, and given the random data $(X_i, Y_i)_{i=1}^N$, we study estimation properties of regularization procedures of the form
$$\hat{f} \in \operatorname*{argmin}_{f \in F}\Bigl(\frac{1}{N}\sum_{i=1}^{N}\bigl(Y_i - f(X_i)\bigr)^2 + \lambda\Psi(f)\Bigr)$$
for some well-chosen regularization parameter $\lambda$. We obtain bounds on the $L_2$ estimation error rate that depend on the complexity of the "true model" $F^* := \{f \in F : \Psi(f) \le \Psi(f^*)\}$, where $f^* \in \operatorname*{argmin}_{f \in F} \mathbb{E}(Y - f(X))^2$ and the $(X_i, Y_i)$'s are independent and distributed as $(X, Y)$. Our estimate holds under weak stochastic assumptions, one of which is a small-ball condition satisfied by $F$, and for rather flexible choices of regularization functions $\Psi(\cdot)$. Moreover, the result holds in the learning theory framework: we do not assume any a priori connection between the output $Y$ and the input $X$. As a proof of concept, we apply our general estimation bound to various choices of $\Psi$, for example the $\ell_p$ and $S_p$ norms (for $p \ge 1$), weak-$\ell_p$, atomic norms, the max-norm and SLOPE. In many cases, the estimation rate almost coincides with the minimax rate in the class $F^*$.
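For concreteness, here is a minimal sketch of the penalized least-squares procedure above, assuming a linear class $f(x) = \langle w, x\rangle$ and $\Psi = \ell_1$ (the LASSO case mentioned in the abstract). The solver `regularized_erm`, its subgradient-descent loop, step size, and synthetic data are all illustrative assumptions, not the paper's construction.

```python
# Minimal sketch of penalized least squares as in the abstract, for a
# linear class f(x) = <w, x>. Solver, step size, and data are illustrative
# assumptions, not the authors' construction.
import numpy as np

def regularized_erm(X, Y, psi_subgrad, lam, steps=5000, lr=1e-2):
    """Subgradient descent on
        (1/N) * sum_i (Y_i - <w, X_i>)^2 + lam * Psi(w),
    where psi_subgrad(w) returns a subgradient of the convex penalty Psi."""
    N, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - Y                 # shape (N,)
        grad = (2.0 / N) * (X.T @ residual)  # gradient of the empirical squared loss
        w = w - lr * (grad + lam * psi_subgrad(w))
    return w

# Psi = l1 norm (the LASSO case): np.sign is a valid subgradient of ||.||_1.
rng = np.random.default_rng(0)
N, d, s = 200, 50, 5
w_true = np.zeros(d)
w_true[:s] = 1.0                             # s-sparse target
X = rng.standard_normal((N, d))
Y = X @ w_true + 0.1 * rng.standard_normal(N)
w_hat = regularized_erm(X, Y, psi_subgrad=np.sign, lam=0.05)
print("L2 estimation error:", np.linalg.norm(w_hat - w_true))
```

Other convex penalties mentioned in the abstract (e.g. SLOPE or atomic norms) could be swapped in by supplying a different subgradient oracle through `psi_subgrad`.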
Similar resources
Optimal control of surface heat flux in a two-dimensional body with temperature-dependent properties
In this paper, the optimal control of boundary heat flux in a 2-D solid body of arbitrary shape is performed in order to achieve the desired temperature distribution over a given time interval. The boundary of the body is subdivided into a number of components. On each component a time-dependent heat flux is applied, independently of the others. Since the thermophysical properties are t...
Supplementary Material to Regularization and the Small-Ball Method I: Sparse Recovery
An inspection of Theorem 3.2 reveals no mention of an isotropicity assumption. There is no choice of a Euclidean structure, and in fact, the statement itself is not even finite-dimensional. All that isotropicity was used for was to bound the "complexity function" r(·) and the "sparsity function" ∆(·) in the three applications: the LASSO (in Theorem 1.4), SLOPE (in Theorem 1.6) and the tra...
GraphConnect: A Regularization Framework for Neural Networks
Deep neural networks have proved very successful in domains where large training sets are available, but when the number of training samples is small, their performance suffers from overfitting. Prior methods of reducing overfitting such as weight decay, Dropout and DropConnect are data-independent. This paper proposes a new method, GraphConnect, that is data-dependent, and is motivated by the ...
Regularization via Mass Transportation
The goal of regression and classification methods in supervised learning is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce training data, overfitting is typically mitigated by adding regularization terms to the objective that penalize hypothesis complexity. In this paper we intr...
Minimizing the total tardiness and makespan in an open shop scheduling problem with sequence-dependent setup times
We consider an open shop scheduling problem in which setup and processing times are treated separately, and the setup times depend not only on the machines but also on the sequence of jobs to be processed on a machine. A novel bi-objective mathematical programming model is designed in order to minimize the total tardiness and the makespan. Among several mult...
Journal title:
Journal of Machine Learning Research
Volume 18, Issue -
Pages: -
Publication date: 2017